Single-Chain Antigen-Binding Proteins

Robert E. Bird,* Karl D. Hardman, James W. Jacobson, 

Syd Johnson, Bennett M. Kaufman, Shwu-Maan Lee, 

Timothy Lee, Sharon H. Pope, Gary S. Riordan, Marc Whitlow 

Single-chain antigen-binding proteins are novel recombinant polypeptides, composed of an antibody variable light-chain amino acid sequence (VL) tethered to a variable heavy-chain sequence (VH) by a designed peptide that links the carboxyl terminus of the VL sequence to the amino terminus of the VH sequence. These proteins have the same specificities and affinities for their antigens as the monoclonal antibodies whose VL and VH sequences were used to construct the recombinant genes that were expressed in Escherichia coli. Three of these proteins, one derived from the sequence for a monoclonal antibody to growth hormone and two derived from the sequences of two different monoclonal antibodies to fluorescein, were designed, constructed, synthesized, purified, and assayed. These proteins are expected to have significant advantages over monoclonal antibodies in a number of applications.



Typical antibodies are composed of four polypeptide chains, two light chains of about 220 amino acid residues and two heavy chains of about 440 amino acids (1). These chains fold into domains of approximately 110 amino acids, assuming a conserved three-dimensional conformation (2). The domains associate to form discrete structural regions. The antigen binding or variable regions are formed by the interaction of the variable light (VL) and variable heavy (VH) domains at the amino termini of the chains. The first constant regions are formed by interaction of the remainder of the light chain and the first constant domain of the heavy chain. Two or three additional constant regions are formed by interaction of the two heavy chains. The heavy chain constant regions are responsible for effector functions, such as complement fixation and binding to receptors (3).

The high background present when intact antibodies have been used to image tumors is in large part due to the binding of the antibody to nontumor cells via the heavy chain constant regions (4). When Fab fragments, consisting of the variable region and the first constant region, have been used, the background problem has been partially circumvented (4). Therefore, a better molecule to use as an imaging or delivery agent would be the Fv fragment, which consists of only the VL and VH domains. Unfortunately, there have been few reports of the successful isolation of Fv fragments by proteolytic digestion of intact antibody molecules (5).

The idea for the design and synthesis of single-chain antigen-binding proteins was conceived during attempts to circumvent problems encountered when expressing antibody genes in Escherichia coli and to avoid problems associated with reassociation of Fv fragments. These proteins consist of the VL and VH sequences synthesized as a single polypeptide chain, with the carboxyl terminus of the VL linked by a designed peptide to the amino terminus of the VH. Both chains are therefore expressed in equimolar concentrations, and the covalent linking of the two chains facilitates the association of the VL and VH domains after folding.

Determination of the three-dimensional structures of antibody fragments by x-ray crystallography has led to the realization that variable domains are each folded into a characteristic structure composed of nine strands of closely packed β-sheets. This structure is maintained despite sequence variation in the VL and VH domains (6). Analysis of antibody primary sequence data has established the existence of two classes of variable region sequence: hypervariable sequences and framework sequences (7). The framework sequences are responsible for the correct β-sheet folding of the VL and VH domains and for the interchain interactions that bring domains together. Each variable domain contains three hypervariable sequences, which appear as loops at one end of the β-sheets. The six hypervariable sequences of the variable region, three from the VL and three from the VH, form the antigen-binding site and are therefore named the complementarity determining regions (CDRs).

Fig. 1. An alpha-carbon model of the anti-BGH 3C2/59 single-chain antigen-binding protein. The VL is red, the VH is blue, and the linker is green. The model was generated by superimposing the sequences of the 3C2 VL and VH domains of the single-chain antigen-binding protein onto the structure of the variable region of MCPC603 (9). The sequence of the protein depicted is
MENVLTQSPAIMSASPGEKVTMTCRASSSVS
SSYLHWFQQKSGASPKLWIYSTSNLA
SGVPARFSGSGSGTSYSLTISSVEAEDA
ATYYCQQYSGYPLTFGAGTKLKESGS
VSSEQLAQFRSLDVQLVESGGDLVKP
GGSLKLSCAASGFTFISYGMSWVRQT
PDKRLEWVATISSGSTYTYYPDSVKG
RFTISRDNAKNTLYLQMSGLKSEDTA
MYYCARRITTVVLTDYYAMDYWGQ
GTSVTVS;
with the linker region underlined. Abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr. The linker, which in this case is derived from a segment of human carbonic anhydrase, spans from residue 105 of the VL to residue 2 of the VH with the use of the Kabat numbering system (7). The carboxyl terminus corresponds to residue 116 of the VH, after which a stop codon was introduced. A methionine was inserted in front of the VL for expression in E. coli. The single-chain antigen-binding proteins have been designated by the monoclonal name (for example, 3C2) with the number of the designed linker sequence after the diagonal mark (for example, 3C2/59).

The design of the single-chain antigen-binding protein was based on the assumption that the molecular interactions responsible for the conserved structure, determined by the framework sequences, would assure that proper folding of the individual VL and VH domains would occur when tethered by a short peptide linker. Linkers of different designs have been used to join the VL and VH sequences.

Fig. 2. The 3C2/59 gene was constructed for easy insertion into an E. coli expression vector such that an ATG codon is placed directly in front of the first codon of the light chain sequence. Expression is controlled by the hybrid λ phage promoter (18). The resulting strain containing the CI857 temperature-labile repressor gene and the expression plasmid was induced by raising the culture temperature to 42°C. Overnight growth of the expression strain GX6539 at 42°C resulted in the production of 3C2/59 protein at greater than 10% of the total cell protein. Cells that had been induced in the presence of [35S]methionine were disrupted by two passes through a French Pressure Cell at 1600 psi. The crude inclusion bodies were recovered by differential centrifugation, dissolved in 50 mM glycine, pH 10.8, 9M urea, 1 mM EDTA, and 20 mM β-mercaptoethanol (19), and diluted to a final protein concentration of ~100 µg/ml in the same buffer. The diluted protein was renatured as described by Boss et al. (19). The renatured protein in phosphate-buffered saline (PBS) was bound to a BGH-Sepharose column and eluted with 3M sodium thiocyanate. After dialysis against PBS, 90% of the 3C2/59 protein bound to a second BGH-Sepharose column. We performed competition assays using 8 x 10^-7 M 3C2/59 protein, BGH-Sepharose beads, and varying concentrations of Fab derived from the monoclonal antibody in a total volume of 175 µl. After washing, we counted individual wells containing the Sepharose beads in a Beckman liquid scintillation counter to quantitate binding. The results from two experiments are shown by the different symbols.

A computer-assisted method (8) was used in designing the first group of linkers. Since the variable domains of antibodies appear to have homologous three-dimensional structures, we based our modeling on the previously published structure of the Fab fragment of MCPC603, a mouse myeloma protein that binds phosphorylcholine (9). Design of the polypeptide linker was initiated by selecting specific amino acids, one near the carboxyl terminus of the VL sequence and one near the amino terminus of the VH sequence. A computer program was then used to search libraries of three-dimensional peptide structures derived from the Brookhaven Protein Data Bank for peptides of the proper molecular dimensions to span the distance in space between the selected amino acids. The number of potential peptides was reduced by specifying that the angle of the peptide bonds at the ends of the linker peptide match the angles of the bonds at the selected amino acids on either side of the linker. Structures of the remaining potential linker peptides were superimposed onto the MCPC603 variable region structure by computer graphics. The peptides that had a conformation that interfered with the Fv structure were discarded. This process was repeated with different amino acids as linker attachment sites to identify a number of peptides that could be used to link the VL and VH regions of antibodies to create single-chain antigen-binding proteins. The linker in single-chain antigen-binding proteins to bovine growth hormone (BGH), 3C2/59, and to fluorescein, 18-2-3/59, was designed by this method. A model of the three-dimensional structure of 3C2/59 protein is shown in Fig. 1.

Additional linkers were designed by the incremental addition of single amino acids or short peptides extending from the carboxyl terminus of the VL to the amino terminus of the VH. Some linkers were designed to minimize interactions with the Fv, and others were designed to fit into a groove on the back of the Fv structure, primarily with alternating glycine and serine residues, with glutamic acid and lysine residues included to enhance solubility. The linkers in the single-chain antigen-binding proteins to fluorescein 18-2-3/202 and 4-4-20/202' were designed in this way. Linker 202 has the sequence EGKSSGSGSESKST. In the 4-4-20/202' protein, an additional amino acid, Gln, is present between the linker and the VH.
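The library search described above for the first group of linkers can be illustrated with a minimal sketch. It assumes each candidate fragment is reduced to two rotation-invariant quantities: its end-to-end span and the angle between its two terminal bond directions. The fragment records, coordinates, tolerances, and the function name `screen_linkers` are all hypothetical, and the final step of discarding peptides that clash with the Fv structure is not modeled.

```python
import math

def distance(a, b):
    """Euclidean distance between two 3D points (angstroms)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def angle_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def screen_linkers(fragments, site_c, site_n, site_c_dir, site_n_dir,
                   dist_tol=1.0, angle_tol=20.0):
    """Keep fragments whose span and terminal-bond angle match the attachment sites.

    dist_tol (angstroms) and angle_tol (degrees) are illustrative values,
    not tolerances taken from the paper.
    """
    gap = distance(site_c, site_n)
    site_angle = angle_deg(site_c_dir, site_n_dir)
    hits = []
    for frag in fragments:
        span = distance(frag["first_ca"], frag["last_ca"])
        frag_angle = angle_deg(frag["first_dir"], frag["last_dir"])
        if abs(span - gap) <= dist_tol and abs(frag_angle - site_angle) <= angle_tol:
            hits.append(frag["name"])
    return hits

# Hypothetical attachment sites: 30 angstroms apart, bond directions at 90 degrees.
site_c, site_n = (0.0, 0.0, 0.0), (30.0, 0.0, 0.0)
site_c_dir, site_n_dir = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)

# Hypothetical library entries (not real Protein Data Bank fragments).
library = [
    {"name": "frag_a", "first_ca": (0, 0, 0), "last_ca": (0, 30.5, 0),
     "first_dir": (0, 1, 0), "last_dir": (1, 0, 0)},   # span and angle both match
    {"name": "frag_b", "first_ca": (0, 0, 0), "last_ca": (12, 0, 0),
     "first_dir": (1, 0, 0), "last_dir": (0, 1, 0)},   # too short
    {"name": "frag_c", "first_ca": (0, 0, 0), "last_ca": (29.8, 0, 0),
     "first_dir": (1, 0, 0), "last_dir": (1, 0, 0)},   # wrong terminal angle
]

hits = screen_linkers(library, site_c, site_n, site_c_dir, site_n_dir)
print(hits)
```

Repeating the call with different attachment sites mirrors the repetition described in the text; a real search would also superimpose each surviving fragment on the Fv coordinates before accepting it.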

The first single-chain antigen-binding protein genes were constructed from the VL and VH sequences of 3C2, an immunoglobulin G1 (IgG1) monoclonal antibody to BGH (anti-BGH). Five versions of an anti-BGH single-chain gene containing different linkers were constructed and expressed in Escherichia coli. The resulting proteins were purified, renatured, and tested for their ability to bind to BGH immobilized on nitrocellulose or covalently linked to Sepharose. The anti-BGH single-chain antigen-binding protein containing a linker designated 59 (3C2/59) was chosen for further study based on its affinity for BGH-Sepharose.

When the 3C2/59 gene was expressed in E. coli, the single-chain protein accumulated in insoluble inclusion bodies. The 3C2/59 protein had an apparent molecular size of 26 kD as determined by SDS-polyacrylamide gel electrophoresis (SDS-PAGE). This compares favorably with the molecular size of 26,652 daltons calculated from the amino acid sequence. The 3C2/59 protein cross-reacted on immunoblots with antiserum prepared against purified 3C2 light chain.
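A molecular size calculated from an amino acid sequence is obtained by summing residue masses and adding one water for the free termini. The sketch below assumes average (not monoisotopic) residue masses, no modifications, and no disulfide correction. The function name `average_mass` is illustrative, and the example input is the linker 202 sequence quoted in the text; the full single-chain sequence could be passed in the same way.

```python
# Average residue masses in daltons for the 20 standard amino acids
# (standard tabulated values; each is the free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the terminal H and OH

def average_mass(sequence):
    """Average molecular mass (Da) of an unmodified linear peptide."""
    seq = sequence.replace(" ", "").upper()
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER

linker_202 = "EGKSSGSGSESKST"  # sequence given in the text
print(f"{len(linker_202)} residues, {average_mass(linker_202):.1f} Da")
```

An unrecognized letter raises `KeyError`, which is a useful check when a sequence has been transcribed from a printed figure.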

The anti-BGH 3C2/59 protein was renatured and purified by affinity chromatography on BGH-Sepharose. The affinity-purified protein migrated as a single band of 26 kD when analyzed by SDS-PAGE under both reduced and nonreduced conditions. The amount of protein that was successfully folded and therefore able to bind to a BGH-Sepharose column varied between 5% and 30% in different experiments. To demonstrate that the affinity-purified protein retained binding activity after thiocyanate elution, the protein was analyzed again by chromatography on a second BGH-Sepharose column. More than 90% of this protein bound to BGH-Sepharose and was eluted with thiocyanate. Renatured single-chain protein produced from a modified 3C2/59 gene in which the sequence of five of the six hypervariable regions had been changed did not bind to BGH-Sepharose, demonstrating that binding of 3C2/59 protein to BGH occurs at the antigen-binding site.

The relative affinity of the purified 3C2/59 protein for BGH was determined by competition with Fab fragments isolated from the 3C2 monoclonal antibody. Increasing amounts of unlabeled Fab were mixed with [35S]methionine-labeled 3C2/59 protein, and the mixture was incubated with BGH-Sepharose. After incubation, the amount of bound labeled protein was determined. The competition curves for two experiments are shown in Fig. 2. The concentration of Fab that inhibited binding of 3C2/59 protein to BGH by 50% was 2 x 10^-7 M, one-fourth the concentration of the labeled protein (8 x 10^-7 M), indicating that the equilibrium constant (Ka) of the 3C2/59 single-chain protein was within a factor of 4 of the Ka of the Fab.
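The 50% inhibition point in a competition experiment of this kind can be read off a curve by interpolating on a logarithmic concentration axis. The sketch below is illustrative only: the competitor concentrations and bound fractions are hypothetical, chosen so that the midpoint falls at the reported 2 x 10^-7 M, and are not the data plotted in Fig. 2. Only the labeled-protein concentration (8 x 10^-7 M) is taken from the text.

```python
import math

def ic50_log_interp(concs, fraction_bound):
    """Competitor concentration at which the bound fraction crosses 0.5.

    Interpolates linearly in log10(concentration) between the two points
    that bracket 0.5. Concentrations must be in ascending order.
    """
    points = list(zip(concs, fraction_bound))
    for (c1, f1), (c2, f2) in zip(points, points[1:]):
        if f1 >= 0.5 >= f2 and f1 != f2:
            t = (f1 - 0.5) / (f1 - f2)
            log_c = math.log10(c1) + t * (math.log10(c2) - math.log10(c1))
            return 10 ** log_c
    raise ValueError("bound fraction never crosses 0.5")

# Hypothetical competition curve (unlabeled Fab, molar; fraction of label bound).
fab_conc = [1e-8, 1e-7, 4e-7, 1e-6]
bound = [0.95, 0.70, 0.30, 0.10]

ic50 = ic50_log_interp(fab_conc, bound)
labeled_conc = 8e-7            # labeled 3C2/59 concentration from the text
ratio = ic50 / labeled_conc    # the text reports one-fourth

print(f"IC50 = {ic50:.2e} M, IC50 / labeled = {ratio:.2f}")
```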

Monoclonal antibodies to fluorescein were chosen for continued development of the single-chain antigen-binding protein technology because this antibody-antigen system has been well characterized (10). Several monoclonal antibodies with high affinity for fluorescein have been isolated, and quantitative assays for equilibrium constant determinations based on the quenching of fluorescein fluorescence have been described (11). Single-chain antigen-binding protein genes were constructed from the sequences of the variable domains of two different monoclonal antibodies to fluorescein: 18-2-3, an IgM (12), and 4-4-20, an IgG2a (13).

Fig. 3. Analysis of fluorescein binding affinities of the 4-4-20 Fab (A) and 4-4-20/202' protein (B) by the use of a fluorescence quenching assay (11). The r represents the fraction of Fab or 4-4-20/202' protein bound, and c represents the concentration of unbound fluorescein. A sample (590 µl) was placed in a 0.5-cm path-length cell, and 10-µl portions of 3.0 x 10^-7 M fluorescein were added to each sample and to buffer specimens (controls); the fluorescence was measured with a Perkin-Elmer LS-5 fluorescence spectrophotometer. Quenching maxima (previously determined) of 0.965 and 0.970 were used for the 4-4-20 Fab and 4-4-20/202' protein analysis, respectively. We calculated protein concentrations from absorbance at 280 nm using calculated molar extinction coefficients at 280 nm (76,182 and 51,267 M^-1 cm^-1 for the 4-4-20 Fab and 4-4-20/202' protein, respectively, based on the Trp and Tyr content of each protein sequence).
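Converting an absorbance at 280 nm to a protein concentration, as mentioned in the Fig. 3 legend, is an application of the Beer-Lambert relation A = ε · c · l. The sketch below also shows how an extinction coefficient can be estimated from Trp and Tyr counts. The per-residue coefficients are commonly tabulated values and are an assumption here: the source does not state which values were used, so this estimate need not reproduce the quoted 76,182 and 51,267 M^-1 cm^-1. The function names and the absorbance reading are hypothetical.

```python
# Assumed per-residue molar absorptivities at 280 nm (M^-1 cm^-1).
# These are commonly tabulated values, NOT taken from the paper.
ASSUMED_EPS_TRP = 5690.0
ASSUMED_EPS_TYR = 1280.0

def estimate_epsilon_280(sequence):
    """Rough molar extinction coefficient from Trp (W) and Tyr (Y) content."""
    seq = sequence.upper()
    return seq.count("W") * ASSUMED_EPS_TRP + seq.count("Y") * ASSUMED_EPS_TYR

def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert: c = A / (epsilon * l), in mol/liter."""
    return a280 / (epsilon * path_cm)

# Using the coefficient quoted for the 4-4-20/202' protein and a
# hypothetical absorbance reading in a 1-cm cell:
conc = molar_concentration(0.51267, 51267.0)
print(f"{conc:.2e} M")
```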

Binding measurements for each sample of anti-fluorescein Ig, Fab, or single-chain antigen-binding protein were performed by fluorescence quenching assays (11). An initial titration of the protein with fluorescein was used to determine the fluorescence quenching maximum (Qmax). An estimate of the Ka was then calculated from this titration to select protein and fluorescein concentrations for accurate determination of the Ka. For the monoclonal antibody 4-4-20, the reported values for Qmax and Ka are 96.4% and 1.7 x 10^10 M^-1, respectively (13).

The 4-4-20/202' protein, renatured from inclusion bodies, was concentrated, dialyzed against 20 mM tris, pH 8.5, loaded onto a DEAE-5PW high-performance liquid chromatography anion exchange column (LKB), and eluted with a linear gradient of 0 to 2M sodium acetate in 20 mM tris, pH 8.5. Binding data for samples of 4-4-20 Fabs and 4-4-20/202' protein are shown in Scatchard plots (14) in Fig. 3, A and B, respectively. The Ka values for the 4-4-20 Fab and the 4-4-20/202' protein, calculated from the slopes in Fig. 3, are 8.0 x 10^9 M^-1 and 1.1 x 10^9 M^-1, respectively. The number of binding sites per molecule was determined from the x-intercept. Values of 0.68 and 0.27 indicate the fractions of active Fab and 4-4-20/202' protein in the preparations under these conditions. Thus, the refolding and nonaffinity purification used for the 4-4-20/202' protein produced 27% active protein with high affinity (15).
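In a Scatchard analysis, r/c is plotted against r, and for a single class of sites the points fall on the line r/c = Ka(n - r). The slope therefore gives -Ka and the x-intercept gives n. A minimal sketch of that fit follows, using an ordinary least-squares line. The data are synthetic, generated from the Ka and active fraction reported for the 4-4-20/202' protein, so the example only checks that the fit recovers its own inputs. It does not reproduce the measurements in Fig. 3, and the function name `scatchard_fit` is illustrative.

```python
def scatchard_fit(r, c):
    """Fit r/c versus r by least squares; return (Ka, n).

    r: fraction of protein bound; c: concentration of unbound ligand (M).
    Model: r/c = Ka * (n - r), so slope = -Ka and x-intercept = n.
    """
    x = list(r)
    y = [ri / ci for ri, ci in zip(r, c)]
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, -intercept / slope

# Synthetic, noise-free data built from the values reported in the text.
true_ka, true_n = 1.1e9, 0.27
free = [2e-10, 5e-10, 1e-9, 2e-9, 5e-9]                       # unbound fluorescein, M
bound = [true_n * true_ka * c / (1 + true_ka * c) for c in free]

ka, n_sites = scatchard_fit(bound, free)
print(f"Ka = {ka:.2e} M^-1, n = {n_sites:.2f}")
```

With real titration data the points scatter around the line, and the x-intercept is sensitive to error in the protein concentration, which is one reason the active fraction is reported alongside Ka.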

Kranz et al. (13) reported that the absorbance spectrum of fluorescein was shifted from a maximum at 493 nm to a maximum at 505 nm when the fluorescein was bound by the 4-4-20 monoclonal antibody. This shift ranged from about 500 to 525 nm for different antibodies to fluorescein. To test whether this shift occurred with fluorescein bound to a single-chain antigen-binding protein, we measured the fluorescence excitation spectra of the 4-4-20 monoclonal antibody, the Fab fragment prepared from this antibody, and the 4-4-20/202' protein (Fig. 4). The 4-4-20/202' protein caused a shift in excitation maximum from 493 to 505 nm similar to that caused by the monoclonal antibody and Fab, demonstrating that fluorescein was bound by the 4-4-20/202' protein in a manner analogous to the monoclonal antibody.

The 18-2-3/59 and 18-2-3/202' proteins also quenched the fluorescence of fluorescein. The Qmax of these proteins was equivalent to that of the 18-2-3 monoclonal antibody. Absolute Ka values were not determined for these proteins.

In summary, single-chain antigen-binding proteins were constructed from the variable region sequences of three different monoclonal antibodies and various linker peptides designed by computational methods and computer graphics. These proteins retain both the affinity and specificity of the starting monoclonal antibodies. The linker peptide that worked in one single-chain antigen-binding protein was also successful in a different protein, as in the anti-BGH 3C2/59 and antifluorescein 18-2-3/59 proteins. We are confident that we can produce active single-chain antigen-binding proteins with the sequence of any monoclonal antibody.

Winter and colleagues (16) showed that the specificity of a monoclonal antibody can be changed by substituting the hypervariable sequences of a mouse antibody into a human framework. It should also be possible to change the specificity of a single-chain antigen-binding protein by replacing the hypervariable sequences with those from a monoclonal antibody having a different specificity. A logical step in the development of a clinically useful antitumor agent would be to insert the hypervariable sequences from antitumor antibodies generated in mice into a single-chain antigen-binding protein derived from human framework regions.

[Figure: panels A to C; x-axis, wavelength (nm), with ticks at 485, 495, and 505.]

Fig. 4. The fluorescence excitation spectra were obtained on a Perkin-Elmer LS-5 fluorescence spectrometer connected to an R100 recorder and were determined at 2°C with 5-mm path-length microcuvettes in a thermostatted holder. Readings were obtained for samples (600 µl) of 5 x 10^-9 M free fluorescein (left peak in all panels) and the same concentration of fluorescein in the presence of (A) 10 µg/ml 4-4-20 monoclonal antibody, (B) 10 µg/ml 4-4-20 Fab fragment, and (C) 0.1 µg/ml 4-4-20/202' protein. The fluorescence emission was monitored at 530 nm while the excitation wavelength was varied from 470 to 515 nm. The nominal absorption maximum of unbound, unquenched fluorescein is about 493 nm [left peak in (A) to (C)] and was monitored on a scale five times that of the protein-bound, quenched fluorescein samples (right peak).

Single-chain antigen-binding proteins are expected to have advantages in clinical applications because of their small size. These proteins should be cleared from serum faster than monoclonal antibodies or Fab fragments. Because they lack the Fc portion of an antibody, which is recognized by cell receptors, they should have a lower background in imaging applications and they should be less immunogenic. They may penetrate the microcirculation surrounding solid tumors better than monoclonal antibodies. We foresee the use of single-chain antigen-binding proteins in applications for which monoclonal antibodies and antibody fragments are currently used, such as (i) imaging and therapy of cancers and cardiovascular or other diseases, (ii) separations, and (iii) biosensors.

The peptide linkers present in single-chain antigen-binding proteins can be designed with specialized functions, such as sites for the chelation of metals or the attachment of drugs or toxins for applications in imaging and therapy. In addition, it will be possible to design sequences into the linkers or at the carboxyl terminus for the attachment of the protein to solid supports for use in clinical assays, separations, and sensing devices.

Note added in proof. Since this report was submitted, another paper dealing with similar technology has appeared (17).
Kaposi's Sarcoma Cells: Long-Term Culture with Growth Factor from Retrovirus-Infected CD4+ T Cells

Shuji Nakamura, S. Zaki Salahuddin, Peter Biberfeld,
Barbara Ensoli, Phillip D. Markham, Flossie Wong-Staal,
Robert C. Gallo*

Studies of the biology and pathogenesis of Kaposi's sarcoma (KS) have been hampered by the inability to maintain long-term cultures of KS cells in vitro. In this study AIDS-KS-derived cells with characteristic spindle-like morphology were cultured with a growth factor (or factors) released by CD4+ T lymphocytes infected with human T-lymphotropic virus type I or II (HTLV-I or HTLV-II) or with human immunodeficiency virus type 1 or 2 (HIV-1 or HIV-2). Medium conditioned by HTLV-II-infected, transformed lines of T cells (HTLV-II CM) contained large amounts of this growth activity and also supported the temporary growth of normal vascular endothelial cells, but not fibroblasts. Interleukin-1 and tumor necrosis factor-α stimulated the growth of the KS-derived cells, but the growth was only transient and these factors could be distinguished from that in HTLV-II CM. Other known endothelial cell growth promoting factors, such as acidic and basic fibroblast growth factors and epidermal growth factor, did not support the long-term growth of the AIDS-KS cells. The factor released by CD4+ T cells infected with human retroviruses should prove useful in studies of the pathogenesis of KS.



Kaposi's sarcoma develops in the form of multifocal lesions consisting of characteristic spindle-shaped cells in a stroma of proliferating abnormal vessels, fibroblasts, and infiltrating leukocytes. An indolent form of KS occurs in elderly males in Mediterranean and African countries (1, 2) and a more aggressive, glandular form of the disease occurs in younger Africans. An aggressive form of KS is also associated with HIV-1 infection, primarily in homosexual men (3, 4), and with immune suppression due to other causes (5). A direct transforming involvement of HIV-1 in the development of AIDS-associated KS (AIDS-KS) is unlikely because genomic sequences of the virus have not been detected in KS tissues (6). Furthermore, no other viruses, environmental factors, or genetic factors have been convincingly linked to any form of KS (4, 6, 7). To gain insight into the nature of KS cells and to search for possible new etiological agents, we developed procedures for establishing cells from AIDS-KS in culture.

A number of growth factors, including 
endothelial cell growth supplement (ECGS) 
and fibroblast growth factors (FGF), that 
were previously shown to stimulate or sup- 